AMENDMENTS TO THE CLAIMS:

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

LISTING OF CLAIMS: 

Claims 1-18 (Canceled). 

19. (Previously Presented) A system for indexing textual content in any of 
a plurality of languages for searching purposes, comprising: 

a tokenizer which separates a string of text into individual word tokens; 

a stemmer which reduces the word tokens to grammatical stems by removing 
word endings which are associated with any one or more of the languages, without 
regard to whether the remaining stem is a recognized word in any combination of the 
plurality of languages; and 

an index which stores the stems in an index. 

20. (Previously Presented) The system of claim 19 wherein the word 
endings which are removed are limited to only those endings which are associated 
with nouns. 

21. (Previously Presented) The system of claim 19 wherein a word ending
is not removed if the resulting stem is less than a predetermined length. 



Attorney's Docket No. P2272C2-000942
Application No. 10/612,936 

22. (Previously Presented) The system of claim 21 wherein said 
predetermined length is four characters. 

23. (Previously Presented) The system of claim 19 wherein the stemmer 
reduces only once per word token. 

24. (Previously Presented) The system of claim 23 wherein said stemmer 
reduces by first examining each word token for the longest known endings, and 
examining the token for successively shorter endings until a known ending is 
identified in the word token and removed. 

25. (Previously Presented) The system of claim 19 wherein the stemmer 
and index disregard stopwords, wherein stopwords are words which occur with 
relatively high frequency in at least one of said languages and which are not also 
significant nouns in another one of said languages. 

26. (Previously Presented) A computer-readable medium containing a 
computer program for searching for documents that may contain text in any of a 
plurality of languages, wherein the computer program performs the steps of: 

separating text in each document to be searched into individual word tokens;

reducing the word tokens to grammatical stems by removing word endings
that are associated with any one or more of the languages, without regard to whether 
the remaining stem is a recognized word in any of the plurality of languages; 

storing the stems in an index that identifies the documents in which words 
containing the stems appeared; 

parsing a query containing a string of text into individual word tokens; 

reducing the word tokens from the query to grammatical stems by removing 
word endings that are associated with any one or more of the languages, without 
regard to whether the remaining stem is a recognized word in any of the plurality of 
languages; 

searching the index for entries that match the stems obtained from the query; and

displaying an identification of the documents that contained matching entries. 

27. (Previously Presented) The computer-readable medium of claim 26, 
wherein the computer program performs the step of: 

displaying a matching entry along with the identification of the document in 
which it appears, wherein a stem is displayed together with an ending to present a 
full word to the user.

28. (Previously Presented) The computer-readable medium of claim 26,
wherein a stem is stored in the index together with the ending that was removed 
from a word token to form that stem, and an entry in the index that matches a stem 
from a query is displayed with the stored ending.

29. (Previously Presented) A system for searching for documents which
may contain text in any of a plurality of languages, comprising:

a tokenizer which separates text in each document to be searched into
individual word tokens;

a stemmer which reduces the word tokens to grammatical stems by removing
word endings which are associated with any one or more of the languages, without
regard to whether the remaining stem is a recognized word in any of the plurality of
languages;

an indexer which stores the stems in an index and which identifies the 
documents in which words containing the stems appeared; 

a search engine which parses a query containing a string of text into individual 
word tokens, reduces the word tokens from said query to grammatical stems by 
removing word endings which are associated with any one or more of the languages, 
without regard to whether the remaining stem is a recognized word in any of the 
plurality of languages, and searches the index for entries which match the stems 
obtained from said query; and 

a display which displays an identification of the documents which contained 
matching entries. 

30. (Previously Presented) The system of claim 29 wherein the display 
displays a matching entry along with the identification of the document in which it 
appears, and wherein a stem is displayed together with an ending to present a full 
word to the user.

31. (Previously Presented) The system of claim 29 wherein a stem is
stored in said index together with the ending that was removed from a word token to 
form that stem, and an entry in the index that matches a stem from a query is 
displayed with said stored ending. 

32. (New) A system for determining a relevance ranking for documents 
that may contain text in any of a plurality of languages, comprising: 

a tokenizer that parses a string of text in a received query into individual word 
tokens; 

a stemmer that reduces the word tokens from the query to grammatical stems 
by removing word endings that are associated with any one or more of the 
languages, without regard to whether the remaining stem is a recognized word in 
any of the plurality of languages; 

a search engine that searches an index for entries that match the stems 
obtained from the query, wherein the index identifies the documents in which words 
containing the stems appeared, retrieves a summary for each document identified as 
containing matching entries, separates text in each summary into individual word 
tokens, reduces the word tokens from each summary to grammatical stems by 
removing word endings that are associated with any one or more of the languages, 
without regard to whether the remaining stem is a recognized word in any 
combination of the plurality of languages, and compares the stems obtained from the 
query with the stems obtained from each summary to determine the relevance 
ranking for each document identified as containing matching entries; and 

a display that displays said relevance rankings.

33. (New) The system of claim 32, wherein said display displays an
identification of the documents that contained matching entries, in order of relevance 
ranking. 

34. (New) The system of claim 33, wherein said display displays a 
matching entry along with the identification of the document in which it appears, 
wherein a stem is displayed together with an ending to present a full word to the 
user. 

35. (New) The system of claim 33, wherein a stem is stored in the index 
together with the ending that was removed from a word token to form that stem, and 
an entry in the index that matches a stem from a query is displayed with the stored 
ending. 

36. (New) The system of claim 32, wherein the word endings that are 
removed are limited to those endings that are associated with nouns.

37. (New) The system of claim 32, wherein a word ending is not removed 
if the resulting stem is less than a predetermined length. 

38. (New) The system of claim 37, wherein the predetermined length is 
four characters.

39. (New) The system of claim 32, wherein the stemmer performs the
reducing step once per word token. 

40. (New) The system of claim 39, wherein the stemmer performs the 
reducing step by first examining each word token for the longest known endings, and 
examining the token for successively shorter endings until a known ending is 
identified in the word token and removed. 

41. (New) A system for determining a relevance ranking for documents
that may contain text in any of a plurality of languages, comprising: 

a tokenizer that parses the string of text in a received query into individual 
word tokens; 

a stemmer that reduces the word tokens from the query to grammatical stems 
by removing word endings that are associated with any one or more of the 
languages, without regard to whether the remaining stem is a recognized word in 
any of the plurality of languages; 

a search engine that searches an index for entries that match the stems 
obtained from the query, wherein the index identifies the documents in which words 
containing the stems appeared, separates, into individual word tokens, text in each 
document identified as containing matching entries, reduces the word tokens to 
grammatical stems by removing word endings that are associated with any one or 
more of the languages, without regard to whether the remaining stem is a recognized 
word in any of the plurality of languages, and compares the stems obtained from the 
query with the stems obtained from each document identified as containing matching
entries to determine the relevance ranking for each identified document; and

a display that displays said relevance rankings.

42. (New) The system of claim 41, wherein said display displays an
identification of the documents that contained matching entries, in order of relevance 
ranking. 

43. (New) The system of claim 42, wherein said display displays a 
matching entry along with the identification of the document in which it appears, 
wherein a stem is displayed together with an ending to present a full word to the 
user. 

44. (New) The system of claim 42, wherein a stem is stored in the index 
together with the ending that was removed from a word token to form that stem, and 
an entry in the index that matches a stem from a query is displayed with the stored 
ending. 

45. (New) The system of claim 41, wherein the word endings that are
removed are limited to those endings that are associated with nouns.

46. (New) The system of claim 41, wherein a word ending is not removed
if the resulting stem is less than a predetermined length.

47. (New) The system of claim 46, wherein the predetermined length is
four characters. 

48. (New) The system of claim 41, wherein the stemmer performs the
reducing step once per word token. 

49. (New) The system of claim 48, wherein the stemmer performs the 
reducing step by first examining each word token for the longest known endings, and 
examining the token for successively shorter endings until a known ending is 
identified in the word token and removed. 
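
ILLUSTRATIVE SKETCH (not part of the claims):

The following Python sketch is offered only as one possible reading of the tokenizing, stemming, indexing and searching steps recited above; it is not the applicant's implementation. The ending list, the function names and the index layout are all hypothetical, since the claims do not enumerate them. It also assumes that when the first known ending found would leave a stem shorter than four characters, the token is left unchanged rather than being re-examined for shorter endings; the claims do not state how those two limitations interact.

```python
import re
from collections import defaultdict

# Hypothetical endings drawn from more than one language; the claims do not
# list any. Sorted longest first so longer endings are examined first.
ENDINGS = sorted(["ations", "ation", "ing", "es", "en", "s", "e"],
                 key=len, reverse=True)
MIN_STEM_LENGTH = 4  # the "predetermined length" of four characters


def tokenize(text):
    """Separate a string of text into lowercase word tokens (letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def stem(token):
    """Remove at most one known ending and return (stem, removed_ending).

    No dictionary lookup is performed, so the stem need not be a real word.
    Assumption: examination stops at the first known ending found, even if
    that ending is then kept because the stem would be too short.
    """
    for ending in ENDINGS:
        if token.endswith(ending):
            if len(token) - len(ending) >= MIN_STEM_LENGTH:
                return token[:-len(ending)], ending
            return token, ""
    return token, ""


def build_index(documents):
    """Map stem -> document id -> set of endings removed to form that stem."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in documents.items():
        for token in tokenize(text):
            s, ending = stem(token)
            index[s][doc_id].add(ending)
    return index


def search(index, query):
    """Stem the query the same way and return matching documents, each with
    the full words (stem plus stored ending) that matched."""
    hits = defaultdict(set)
    for token in tokenize(query):
        s, _ = stem(token)
        for doc_id, endings in index.get(s, {}).items():
            hits[doc_id].update(s + e for e in endings)
    return dict(hits)
```

In this sketch, "tables" and "table" both reduce to the non-word stem "tabl", so a query for one finds documents containing the other, and the stored ending allows the full word to be shown in the result.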



